Misfolding Dominates Protein Evolution
Abstract
The diverse array of protein functions depends upon these molecules’ reliable ability to fold into the native structures determined by their amino-acid sequences. Because mutations that alter a protein’s sequence frequently disrupt its folding, protein evolution explores sequence space conservatively, either by point mutations or by recombination between related sequences. Attempts to engineer proteins by co-opting the evolutionary algorithm have also largely proceeded by the stepwise accumulation of beneficial mutations. Other strategies for directed evolution introduce many mutations at once to increase the likelihood of finding improved variants, attempting to balance higher mutational diversity against lower retention of folding. Using simple models, I explore this tradeoff and find that protein misfolding dominates the outcome: it largely determines whether raising mutation levels increases the number of improved variants. I analyze results of a popular mutagenesis protocol, error-prone PCR, for evidence that coupling between mutations might favor higher mutation levels, as claimed by several groups. A comparison of high-mutation-rate mutagenesis with recombination between distantly related proteins reveals qualitative differences in protein tolerance for the sequence changes introduced by each method. Mutational tolerance may also be reflected in the rate at which proteins accumulate sequence changes over evolutionary time; why proteins evolve at different rates remains a major open question in biology. An analysis of rate determinants suggests that one major variable, linked to how highly expressed the encoding gene is, dominates the rate of yeast protein evolution. To explain this trend, I hypothesize that proteins are selected to fold properly despite mistranslation, a property I call translational robustness, and I test this hypothesis using genomic data.
To examine protein evolution at a higher level of detail, I construct a large-scale simulation in which simulated organisms, whose genomes contain genes expressing computationally foldable proteins at different levels, evolve over millions of generations with protein misfolding imposing the only fitness cost. The results suggest that protein misfolding suffices to explain many significant trends in genome evolution.
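The tradeoff between mutational diversity and retention of folding can be illustrated with a toy calculation. The sketch below is not the model used in this work; it is a minimal illustration under stated assumptions: each mutation independently preserves folding with probability `nu`, each mutation is beneficial with probability `b`, the number of mutations per variant is Poisson-distributed with mean `mean_mutations`, and a variant counts as improved if it folds and carries at least one beneficial mutation. All function and parameter names are hypothetical.

```python
import math


def fraction_folded(mean_mutations, nu):
    """Expected fraction of variants that still fold.

    Assumes a Poisson number of mutations and an independent per-mutation
    probability nu of retaining the fold, so E[nu**m] = exp(-mean * (1 - nu)).
    """
    return math.exp(-mean_mutations * (1.0 - nu))


def fraction_improved(mean_mutations, nu, b):
    """Expected fraction of variants that fold AND carry >= 1 beneficial mutation.

    For m mutations this is nu**m * (1 - (1 - b)**m); averaging over the
    Poisson distribution gives the difference of two exponentials.
    """
    folded = math.exp(-mean_mutations * (1.0 - nu))
    folded_without_benefit = math.exp(-mean_mutations * (1.0 - nu * (1.0 - b)))
    return folded - folded_without_benefit


def optimal_mean_mutations(nu, b):
    """Mean mutation level that maximizes fraction_improved (toy model only).

    Obtained by setting the derivative with respect to the mean to zero.
    """
    return math.log1p(nu * b / (1.0 - nu)) / (nu * b)


def best_mutation_level(nu, b, levels):
    """Pick the best level from a list of candidates by direct evaluation."""
    return max(levels, key=lambda lam: fraction_improved(lam, nu, b))


if __name__ == "__main__":
    nu = 0.5  # assumed per-mutation probability of retaining the fold
    for b in (0.001, 0.01):  # assumed per-mutation probability of benefit
        lam = optimal_mean_mutations(nu, b)
        print(f"b={b}: optimal mean mutations ~ {lam:.3f}, "
              f"folded fraction there ~ {fraction_folded(lam, nu):.3f}")
```

In this toy model, when beneficial mutations are rare the optimal mutation level is approximately 1/(1 - nu), so it is set almost entirely by tolerance to misfolding rather than by how common beneficial mutations are.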
Related resources
Impact of translational error-induced and error-free misfolding on the rate of protein evolution
What determines the rate of protein evolution is a fundamental question in biology. Recent genomic studies revealed a surprisingly strong anticorrelation between the expression level of a protein and its rate of sequence evolution. This observation is currently explained by the translational robustness hypothesis in which the toxicity of translational error-induced protein misfolding selects fo...
Point mutations in protein globular domains: contributions from function, stability and misfolding.
Several contrasting hypotheses have been formulated about the influence of functional and conformational properties, like stability and avoidance of misfolding, on the evolution of protein globular domains. Selection at functional sites has been suggested to be detrimental to stability or coupled to it. Avoidance of misfolding may be achieved by discarding misfolding-prone sequences or by maint...
Transient misfolding dominates multidomain protein folding
Neighbouring domains of multidomain proteins with homologous tandem repeats have divergent sequences, probably as a result of evolutionary pressure to avoid misfolding and aggregation, particularly at the high cellular protein concentrations. Here we combine microfluidic-mixing single-molecule kinetics, ensemble experiments and molecular simulations to investigate how misfolding between the imm...
Stability constraints and protein evolution: the role of chain length, composition and disulfide bonds.
Stability of the native state is an essential requirement in protein evolution and design. Here we investigated the interplay between chain length and stability constraints using a simple model of protein folding and a statistical study of the Protein Data Bank. We distinguish two types of stability of the native state: with respect to the unfolded state (unfolding stability) and with respect t...
Negative Correlation between Expression Level and Evolutionary Rate of Long Intergenic Noncoding RNAs
Mammalian genomes contain numerous genes for long noncoding RNAs (lncRNAs). The functions of the lncRNAs remain largely unknown but their evolution appears to be constrained by purifying selection, albeit relatively weakly. To gain insights into the mode of evolution and the functional range of the lncRNA, they can be compared with much better characterized protein-coding genes. The evolutionar...